Overview

Dataset statistics

Number of variables: 13
Number of observations: 2969
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 290.1 KiB
Average record size in memory: 100.0 B
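
As a rough cross-check of the figures above, the average record size should be close to the total memory divided by the number of observations. The sketch below assumes 1 KiB = 1024 bytes and works from the rounded values displayed in the report, so it recovers the result only approximately.

```python
# Values copied from the dataset statistics (rounded as displayed).
n_variables = 13
n_observations = 2969
total_memory_kib = 290.1
missing_cells = 0

# Assumption: 1 KiB = 1024 bytes.
n_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / n_cells
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"cells: {n_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {round(avg_record_bytes)} B")
```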

Variable types

Numeric: 13

Alerts

gross_revenue is highly correlated with q_invoices and 5 other fields [High correlation]
recency_days is highly correlated with q_invoices [High correlation]
q_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
q_items is highly correlated with gross_revenue and 5 other fields [High correlation]
q_products is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_ticket is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_recency_days is highly correlated with frequency [High correlation]
frequency is highly correlated with avg_recency_days [High correlation]
avg_basket_size is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_unique_basket_size is highly correlated with q_products [High correlation]
q_returns is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_ticket is highly skewed (γ1 = 53.4442279) [Skewed]
q_returns is highly skewed (γ1 = 51.79774426) [Skewed]
avg_basket_size is highly skewed (γ1 = 44.68328098) [Skewed]
df_index has unique values [Unique]
customer_id has unique values [Unique]
recency_days has 34 (1.1%) zeros [Zeros]
q_returns has 1481 (49.9%) zeros [Zeros]
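
The Zeros and Skewed alerts can be reproduced from numbers reported elsewhere in this document. The sketch below is illustrative only: the cut-off `SKEW_THRESHOLD = 20.0` is a hypothetical value chosen because it is consistent with which variables were flagged (the highest unflagged skewness is about 17.9, the lowest flagged one about 44.7). It is not taken from the report.

```python
N_OBS = 2969

# Zero counts copied from the alerts above.
zero_counts = {"recency_days": 34, "q_returns": 1481}
zero_pct = {name: round(100 * count / N_OBS, 1) for name, count in zero_counts.items()}

# Skewness values copied from the per-variable descriptive statistics.
skewness = {
    "gross_revenue": 16.77787915,
    "q_invoices": 10.76645634,
    "q_items": 17.87844459,
    "q_products": 15.70613971,
    "avg_ticket": 53.4442279,
    "q_returns": 51.79774426,
    "avg_basket_size": 44.68328098,
}

SKEW_THRESHOLD = 20.0  # hypothetical cut-off, an assumption for this sketch

skewed = sorted(name for name, g1 in skewness.items() if abs(g1) > SKEW_THRESHOLD)

print(zero_pct)
print(skewed)
```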

Reproduction

Analysis started: 2022-10-26 13:43:51.803841
Analysis finished: 2022-10-26 13:44:15.570083
Duration: 23.77 seconds
Software version: pandas-profiling v3.4.0
Download configuration: config.json
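
The reported duration follows directly from the two timestamps. A minimal check, using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2022-10-26 13:43:51.803841")
finished = datetime.fromisoformat("2022-10-26 13:44:15.570083")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```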

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 2969
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2317.292354
Minimum: 0
Maximum: 5715
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 185.4
Q1: 929
median: 2120
Q3: 3537
95-th percentile: 5035.2
Maximum: 5715
Range: 5715
Interquartile range (IQR): 2608

Descriptive statistics

Standard deviation: 1554.944589
Coefficient of variation (CV): 0.6710178739
Kurtosis: -1.010787014
Mean: 2317.292354
Median Absolute Deviation (MAD): 1271
Skewness: 0.342284058
Sum: 6880041
Variance: 2417852.674
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1 | < 0.1%
3011 | 1 | < 0.1%
2996 | 1 | < 0.1%
2999 | 1 | < 0.1%
3000 | 1 | < 0.1%
3001 | 1 | < 0.1%
3002 | 1 | < 0.1%
3005 | 1 | < 0.1%
3007 | 1 | < 0.1%
3008 | 1 | < 0.1%
Other values (2959) | 2959 | 99.7%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Value | Count | Frequency (%)
5715 | 1 | < 0.1%
5696 | 1 | < 0.1%
5686 | 1 | < 0.1%
5680 | 1 | < 0.1%
5659 | 1 | < 0.1%
5655 | 1 | < 0.1%
5649 | 1 | < 0.1%
5638 | 1 | < 0.1%
5637 | 1 | < 0.1%
5627 | 1 | < 0.1%
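
Several of the df_index statistics are derived from others: the range from the extremes, the IQR from the quartiles, the CV from the standard deviation and mean, the variance from the standard deviation, and the sum from the mean and the observation count. The sketch below recomputes them from the reported values as a consistency check. It assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared).

```python
# Values copied from the df_index statistics.
n_obs = 2969
minimum, maximum = 0, 5715
q1, q3 = 929, 3537
mean = 2317.292354
std = 1554.944589

value_range = maximum - minimum   # reported: 5715
iqr = q3 - q1                     # reported: 2608
cv = std / mean                   # reported: 0.6710178739
variance = std ** 2               # reported: 2417852.674
total = mean * n_obs              # reported: 6880041

print(value_range, iqr)
print(f"CV={cv:.10f} variance={variance:.3f} sum={total:.0f}")
```

The same relationships hold for the other twelve variables, which makes them a quick way to spot transcription errors in a flattened report.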

customer_id
Real number (ℝ≥0)

UNIQUE

Distinct: 2969
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15270.77299
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.7 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12619.4
Q1: 13799
median: 15221
Q3: 16768
95-th percentile: 17964.6
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2969

Descriptive statistics

Standard deviation: 1718.990292
Coefficient of variation (CV): 0.1125673398
Kurtosis: -1.206094692
Mean: 15270.77299
Median Absolute Deviation (MAD): 1488
Skewness: 0.03160785866
Sum: 45338925
Variance: 2954927.624
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
17850 | 1 | < 0.1%
17588 | 1 | < 0.1%
14905 | 1 | < 0.1%
16103 | 1 | < 0.1%
14626 | 1 | < 0.1%
14868 | 1 | < 0.1%
18246 | 1 | < 0.1%
17115 | 1 | < 0.1%
16611 | 1 | < 0.1%
15912 | 1 | < 0.1%
Other values (2959) | 2959 | 99.7%

Value | Count | Frequency (%)
12347 | 1 | < 0.1%
12348 | 1 | < 0.1%
12352 | 1 | < 0.1%
12356 | 1 | < 0.1%
12358 | 1 | < 0.1%
12359 | 1 | < 0.1%
12360 | 1 | < 0.1%
12362 | 1 | < 0.1%
12364 | 1 | < 0.1%
12370 | 1 | < 0.1%

Value | Count | Frequency (%)
18287 | 1 | < 0.1%
18283 | 1 | < 0.1%
18282 | 1 | < 0.1%
18277 | 1 | < 0.1%
18276 | 1 | < 0.1%
18274 | 1 | < 0.1%
18273 | 1 | < 0.1%
18272 | 1 | < 0.1%
18270 | 1 | < 0.1%
18269 | 1 | < 0.1%

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2954
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2749.226056
Minimum: 6.2
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 6.2
5-th percentile: 229.77
Q1: 570.96
median: 1086.92
Q3: 2308.06
95-th percentile: 7219.68
Maximum: 279138.02
Range: 279131.82
Interquartile range (IQR): 1737.1

Descriptive statistics

Standard deviation: 10580.4905
Coefficient of variation (CV): 3.848534202
Kurtosis: 353.9585684
Mean: 2749.226056
Median Absolute Deviation (MAD): 672.72
Skewness: 16.77787915
Sum: 8162452.16
Variance: 111946779.3
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
178.96 | 2 | 0.1%
533.33 | 2 | 0.1%
889.93 | 2 | 0.1%
2053.02 | 2 | 0.1%
745.06 | 2 | 0.1%
379.65 | 2 | 0.1%
2092.32 | 2 | 0.1%
731.9 | 2 | 0.1%
1353.74 | 2 | 0.1%
331 | 2 | 0.1%
Other values (2944) | 2949 | 99.3%

Value | Count | Frequency (%)
6.2 | 1 | < 0.1%
13.3 | 1 | < 0.1%
15 | 1 | < 0.1%
36.56 | 1 | < 0.1%
45 | 1 | < 0.1%
52 | 1 | < 0.1%
52.2 | 1 | < 0.1%
52.2 | 1 | < 0.1%
62.43 | 1 | < 0.1%
68.84 | 1 | < 0.1%

Value | Count | Frequency (%)
279138.02 | 1 | < 0.1%
259657.3 | 1 | < 0.1%
194550.79 | 1 | < 0.1%
168472.5 | 1 | < 0.1%
140438.72 | 1 | < 0.1%
124564.53 | 1 | < 0.1%
117375.63 | 1 | < 0.1%
91062.38 | 1 | < 0.1%
72882.09 | 1 | < 0.1%
66653.56 | 1 | < 0.1%

recency_days
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 272
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.28864938
Minimum: 0
Maximum: 373
Zeros: 34
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 11
median: 31
Q3: 81
95-th percentile: 242
Maximum: 373
Range: 373
Interquartile range (IQR): 70

Descriptive statistics

Standard deviation: 77.75617089
Coefficient of variation (CV): 1.209485215
Kurtosis: 2.778038567
Mean: 64.28864938
Median Absolute Deviation (MAD): 26
Skewness: 1.798396863
Sum: 190873
Variance: 6046.022112
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 99 | 3.3%
4 | 87 | 2.9%
2 | 85 | 2.9%
3 | 85 | 2.9%
8 | 76 | 2.6%
10 | 67 | 2.3%
9 | 66 | 2.2%
7 | 66 | 2.2%
17 | 64 | 2.2%
22 | 55 | 1.9%
Other values (262) | 2219 | 74.7%

Value | Count | Frequency (%)
0 | 34 | 1.1%
1 | 99 | 3.3%
2 | 85 | 2.9%
3 | 85 | 2.9%
4 | 87 | 2.9%
5 | 43 | 1.4%
7 | 66 | 2.2%
8 | 76 | 2.6%
9 | 66 | 2.2%
10 | 67 | 2.3%

Value | Count | Frequency (%)
373 | 2 | 0.1%
372 | 4 | 0.1%
371 | 1 | < 0.1%
368 | 1 | < 0.1%
366 | 4 | 0.1%
365 | 2 | 0.1%
364 | 1 | < 0.1%
360 | 1 | < 0.1%
359 | 1 | < 0.1%
358 | 4 | 0.1%
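
The frequency tables in this report list the ten most common values and fold everything else into an "Other values (k)" row, where k is the number of remaining distinct values. The sketch below shows one way to build such a table with `collections.Counter`. The helper names `frequency_table` and `format_pct` are hypothetical, and the "< 0.1%" display rule is inferred from the tables here (a count of 1 out of 2969 shows "< 0.1%", a count of 2 shows "0.1%"), not taken from the tool's source.

```python
from collections import Counter


def format_pct(pct):
    """Format a percentage to one decimal; non-zero values that round to 0.0 show as '< 0.1%'."""
    text = f"{pct:.1f}%"
    return "< 0.1%" if pct > 0 and text == "0.0%" else text


def frequency_table(values, top=10):
    """Return (value, count, percent) rows for the most common values plus an 'Other values' row."""
    counts = Counter(values)
    n = len(values)
    rows = [(value, count, 100 * count / n) for value, count in counts.most_common(top)]
    shown = sum(count for _, count, _ in rows)
    remaining_distinct = len(counts) - len(rows)
    if remaining_distinct > 0:
        rows.append((f"Other values ({remaining_distinct})", n - shown, 100 * (n - shown) / n))
    return rows


# Toy data for illustration; not taken from the dataset.
sample = [1, 1, 1, 2, 2, 3]
for value, count, pct in frequency_table(sample, top=2):
    print(value, count, format_pct(pct))
```

Applied to the recency_days counts above, `format_pct(100 * 99 / 2969)` gives "3.3%" and `format_pct(100 * 2219 / 2969)` gives "74.7%", matching the table.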

q_invoices
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 56
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.72280229
Minimum: 1
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 4
Q3: 6
95-th percentile: 17
Maximum: 206
Range: 205
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 8.85665393
Coefficient of variation (CV): 1.547607882
Kurtosis: 190.8253633
Mean: 5.72280229
Median Absolute Deviation (MAD): 2
Skewness: 10.76645634
Sum: 16991
Variance: 78.44031883
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2 | 786 | 26.5%
3 | 498 | 16.8%
4 | 393 | 13.2%
5 | 237 | 8.0%
1 | 190 | 6.4%
6 | 173 | 5.8%
7 | 138 | 4.6%
8 | 98 | 3.3%
9 | 69 | 2.3%
10 | 55 | 1.9%
Other values (46) | 332 | 11.2%

Value | Count | Frequency (%)
1 | 190 | 6.4%
2 | 786 | 26.5%
3 | 498 | 16.8%
4 | 393 | 13.2%
5 | 237 | 8.0%
6 | 173 | 5.8%
7 | 138 | 4.6%
8 | 98 | 3.3%
9 | 69 | 2.3%
10 | 55 | 1.9%

Value | Count | Frequency (%)
206 | 1 | < 0.1%
199 | 1 | < 0.1%
124 | 1 | < 0.1%
97 | 1 | < 0.1%
91 | 2 | 0.1%
86 | 1 | < 0.1%
72 | 1 | < 0.1%
62 | 2 | 0.1%
60 | 1 | < 0.1%
57 | 1 | < 0.1%

q_items
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1665
Distinct (%): 56.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1606.461098
Minimum: 1
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 101.4
Q1: 296
median: 639
Q3: 1399
95-th percentile: 4407.4
Maximum: 196844
Range: 196843
Interquartile range (IQR): 1103

Descriptive statistics

Standard deviation: 5882.976527
Coefficient of variation (CV): 3.6620722
Kurtosis: 467.153716
Mean: 1606.461098
Median Absolute Deviation (MAD): 420
Skewness: 17.87844459
Sum: 4769583
Variance: 34609412.81
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
310 | 11 | 0.4%
88 | 9 | 0.3%
150 | 9 | 0.3%
260 | 8 | 0.3%
84 | 8 | 0.3%
288 | 8 | 0.3%
272 | 8 | 0.3%
246 | 8 | 0.3%
516 | 7 | 0.2%
394 | 7 | 0.2%
Other values (1655) | 2886 | 97.2%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 2 | 0.1%
12 | 2 | 0.1%
16 | 1 | < 0.1%
17 | 1 | < 0.1%
18 | 1 | < 0.1%
19 | 1 | < 0.1%
20 | 1 | < 0.1%
23 | 1 | < 0.1%
25 | 1 | < 0.1%

Value | Count | Frequency (%)
196844 | 1 | < 0.1%
80997 | 1 | < 0.1%
79963 | 1 | < 0.1%
77373 | 1 | < 0.1%
69993 | 1 | < 0.1%
64549 | 1 | < 0.1%
64124 | 1 | < 0.1%
62812 | 1 | < 0.1%
58243 | 1 | < 0.1%
57785 | 1 | < 0.1%

q_products
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 469
Distinct (%): 15.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 122.705288
Minimum: 1
Maximum: 7837
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9
Q1: 29
median: 67
Q3: 135
95-th percentile: 382
Maximum: 7837
Range: 7836
Interquartile range (IQR): 106

Descriptive statistics

Standard deviation: 269.8419967
Coefficient of variation (CV): 2.199106503
Kurtosis: 354.8373546
Mean: 122.705288
Median Absolute Deviation (MAD): 44
Skewness: 15.70613971
Sum: 364312
Variance: 72814.70321
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
28 | 45 | 1.5%
20 | 38 | 1.3%
35 | 35 | 1.2%
15 | 33 | 1.1%
29 | 33 | 1.1%
19 | 33 | 1.1%
11 | 32 | 1.1%
26 | 31 | 1.0%
27 | 30 | 1.0%
25 | 29 | 1.0%
Other values (459) | 2630 | 88.6%

Value | Count | Frequency (%)
1 | 6 | 0.2%
2 | 14 | 0.5%
3 | 16 | 0.5%
4 | 17 | 0.6%
5 | 26 | 0.9%
6 | 29 | 1.0%
7 | 18 | 0.6%
8 | 19 | 0.6%
9 | 27 | 0.9%
10 | 27 | 0.9%

Value | Count | Frequency (%)
7837 | 1 | < 0.1%
5670 | 1 | < 0.1%
5095 | 1 | < 0.1%
4577 | 1 | < 0.1%
2698 | 1 | < 0.1%
2379 | 1 | < 0.1%
2060 | 1 | < 0.1%
1818 | 1 | < 0.1%
1673 | 1 | < 0.1%
1636 | 1 | < 0.1%

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 2966
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.90005685
Minimum: 2.150588235
Maximum: 56157.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 2.150588235
5-th percentile: 4.916661099
Q1: 13.11933333
median: 17.97438356
Q3: 24.98828571
95-th percentile: 90.497
Maximum: 56157.5
Range: 56155.34941
Interquartile range (IQR): 11.86895238

Descriptive statistics

Standard deviation: 1036.934336
Coefficient of variation (CV): 19.9794451
Kurtosis: 2890.70744
Mean: 51.90005685
Median Absolute Deviation (MAD): 5.994222271
Skewness: 53.4442279
Sum: 154091.2688
Variance: 1075232.818
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
15 | 2 | 0.1%
4.162 | 2 | 0.1%
14.47833333 | 2 | 0.1%
18.15222222 | 1 | < 0.1%
13.92736842 | 1 | < 0.1%
36.24411765 | 1 | < 0.1%
29.78416667 | 1 | < 0.1%
22.8792623 | 1 | < 0.1%
20.51104167 | 1 | < 0.1%
149.025 | 1 | < 0.1%
Other values (2956) | 2956 | 99.6%

Value | Count | Frequency (%)
2.150588235 | 1 | < 0.1%
2.4325 | 1 | < 0.1%
2.462371134 | 1 | < 0.1%
2.511241379 | 1 | < 0.1%
2.515333333 | 1 | < 0.1%
2.65 | 1 | < 0.1%
2.656931818 | 1 | < 0.1%
2.707598253 | 1 | < 0.1%
2.760621572 | 1 | < 0.1%
2.770464191 | 1 | < 0.1%

Value | Count | Frequency (%)
56157.5 | 1 | < 0.1%
4453.43 | 1 | < 0.1%
3202.92 | 1 | < 0.1%
1687.2 | 1 | < 0.1%
952.9875 | 1 | < 0.1%
872.13 | 1 | < 0.1%
841.0214493 | 1 | < 0.1%
651.1683333 | 1 | < 0.1%
640 | 1 | < 0.1%
624.4 | 1 | < 0.1%
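
avg_ticket carries the Skewed alert, and its largest value (56157.5) is more than ten times the next largest (4453.43). The sketch below illustrates how a single extreme value can dominate sample skewness. It assumes the reported skewness is the bias-adjusted Fisher-Pearson coefficient (the definition used by pandas' `Series.skew`), and the data is made up for illustration rather than drawn from the dataset.

```python
import math


def sample_skewness(values):
    """Bias-adjusted Fisher-Pearson skewness: G1 = sqrt(n(n-1)) / (n-2) * m3 / m2**1.5."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


symmetric = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 100]  # same data with one extreme value

print(sample_skewness(symmetric))
print(sample_skewness(with_outlier))
```

For heavily right-skewed monetary variables like this one, the median and IQR reported above are usually more representative summaries than the mean and standard deviation.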

avg_recency_days
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1258
Distinct (%): 42.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67.35143043
Minimum: 1
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 25.92857143
median: 48.28571429
Q3: 85.33333333
95-th percentile: 201
Maximum: 366
Range: 365
Interquartile range (IQR): 59.4047619

Descriptive statistics

Standard deviation: 63.54282948
Coefficient of variation (CV): 0.9434518178
Kurtosis: 4.887703174
Mean: 67.35143043
Median Absolute Deviation (MAD): 26.28571429
Skewness: 2.062908983
Sum: 199966.397
Variance: 4037.691178
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
14 | 25 | 0.8%
4 | 22 | 0.7%
70 | 21 | 0.7%
7 | 20 | 0.7%
35 | 19 | 0.6%
49 | 18 | 0.6%
21 | 17 | 0.6%
46 | 17 | 0.6%
11 | 17 | 0.6%
1 | 16 | 0.5%
Other values (1248) | 2777 | 93.5%

Value | Count | Frequency (%)
1 | 16 | 0.5%
1.5 | 1 | < 0.1%
2 | 13 | 0.4%
2.5 | 1 | < 0.1%
2.601398601 | 1 | < 0.1%
3 | 15 | 0.5%
3.321428571 | 1 | < 0.1%
3.330357143 | 1 | < 0.1%
3.5 | 2 | 0.1%
4 | 22 | 0.7%

Value | Count | Frequency (%)
366 | 1 | < 0.1%
365 | 1 | < 0.1%
363 | 1 | < 0.1%
362 | 1 | < 0.1%
357 | 2 | 0.1%
356 | 1 | < 0.1%
355 | 2 | 0.1%
352 | 1 | < 0.1%
351 | 2 | 0.1%
350 | 3 | 0.1%

frequency
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1350
Distinct (%): 45.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.06327172298
Minimum: 0.005449591281
Maximum: 3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.009433962264
Q1: 0.01777777778
median: 0.02941176471
Q3: 0.05540166205
95-th percentile: 0.2222222222
Maximum: 3
Range: 2.994550409
Interquartile range (IQR): 0.03762388427

Descriptive statistics

Standard deviation: 0.1344819335
Coefficient of variation (CV): 2.125466593
Kurtosis: 121.5596918
Mean: 0.06327172298
Median Absolute Deviation (MAD): 0.01433823529
Skewness: 8.773426515
Sum: 187.8537455
Variance: 0.01808539044
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.1666666667 | 21 | 0.7%
0.3333333333 | 21 | 0.7%
0.02777777778 | 20 | 0.7%
0.09090909091 | 19 | 0.6%
0.0625 | 17 | 0.6%
0.1333333333 | 16 | 0.5%
0.4 | 16 | 0.5%
0.25 | 15 | 0.5%
0.02380952381 | 15 | 0.5%
0.03571428571 | 15 | 0.5%
Other values (1340) | 2794 | 94.1%

Value | Count | Frequency (%)
0.005449591281 | 1 | < 0.1%
0.005464480874 | 1 | < 0.1%
0.005494505495 | 1 | < 0.1%
0.005509641873 | 1 | < 0.1%
0.005586592179 | 2 | 0.1%
0.005602240896 | 1 | < 0.1%
0.005617977528 | 2 | 0.1%
0.00566572238 | 1 | < 0.1%
0.005681818182 | 2 | 0.1%
0.005698005698 | 3 | 0.1%

Value | Count | Frequency (%)
3 | 1 | < 0.1%
2 | 1 | < 0.1%
1.571428571 | 1 | < 0.1%
1.5 | 3 | 0.1%
1 | 14 | 0.5%
0.8333333333 | 1 | < 0.1%
0.75 | 1 | < 0.1%
0.6666666667 | 12 | 0.4%
0.6514745308 | 1 | < 0.1%
0.6 | 1 | < 0.1%

q_returns
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 214
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 62.1569552
Minimum: 0
Maximum: 80995
Zeros: 1481
Zeros (%): 49.9%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 9
95-th percentile: 100.6
Maximum: 80995
Range: 80995
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 1512.496135
Coefficient of variation (CV): 24.33349783
Kurtosis: 2765.52864
Mean: 62.1569552
Median Absolute Deviation (MAD): 1
Skewness: 51.79774426
Sum: 184544
Variance: 2287644.557
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1481 | 49.9%
1 | 164 | 5.5%
2 | 148 | 5.0%
3 | 105 | 3.5%
4 | 89 | 3.0%
6 | 78 | 2.6%
5 | 61 | 2.1%
12 | 51 | 1.7%
8 | 43 | 1.4%
7 | 43 | 1.4%
Other values (204) | 706 | 23.8%

Value | Count | Frequency (%)
0 | 1481 | 49.9%
1 | 164 | 5.5%
2 | 148 | 5.0%
3 | 105 | 3.5%
4 | 89 | 3.0%
5 | 61 | 2.1%
6 | 78 | 2.6%
7 | 43 | 1.4%
8 | 43 | 1.4%
9 | 41 | 1.4%

Value | Count | Frequency (%)
80995 | 1 | < 0.1%
9014 | 1 | < 0.1%
8004 | 1 | < 0.1%
4427 | 1 | < 0.1%
3768 | 1 | < 0.1%
3332 | 1 | < 0.1%
2878 | 1 | < 0.1%
2022 | 1 | < 0.1%
2012 | 1 | < 0.1%
1776 | 1 | < 0.1%

avg_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 1973
Distinct (%): 66.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 249.349541
Minimum: 1
Maximum: 40498.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 44
Q1: 103.25
median: 172
Q3: 281.5
95-th percentile: 599.52
Maximum: 40498.5
Range: 40497.5
Interquartile range (IQR): 178.25

Descriptive statistics

Standard deviation: 791.5024106
Coefficient of variation (CV): 3.174268569
Kurtosis: 2256.245507
Mean: 249.349541
Median Absolute Deviation (MAD): 82.75
Skewness: 44.68328098
Sum: 740318.7873
Variance: 626476.066
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
100 | 11 | 0.4%
114 | 10 | 0.3%
73 | 9 | 0.3%
86 | 9 | 0.3%
82 | 9 | 0.3%
136 | 8 | 0.3%
60 | 8 | 0.3%
75 | 8 | 0.3%
88 | 8 | 0.3%
71 | 7 | 0.2%
Other values (1963) | 2882 | 97.1%

Value | Count | Frequency (%)
1 | 2 | 0.1%
2 | 1 | < 0.1%
3.333333333 | 1 | < 0.1%
5.333333333 | 1 | < 0.1%
5.666666667 | 1 | < 0.1%
6.142857143 | 1 | < 0.1%
7.5 | 1 | < 0.1%
9 | 1 | < 0.1%
9.5 | 1 | < 0.1%
11 | 1 | < 0.1%

Value | Count | Frequency (%)
40498.5 | 1 | < 0.1%
6009.333333 | 1 | < 0.1%
4282 | 1 | < 0.1%
3906 | 1 | < 0.1%
3868.65 | 1 | < 0.1%
2880 | 1 | < 0.1%
2801 | 1 | < 0.1%
2733.944444 | 1 | < 0.1%
2518.769231 | 1 | < 0.1%
2160.333333 | 1 | < 0.1%

avg_unique_basket_size
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1010
Distinct (%): 34.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.15507374
Minimum: 1
Maximum: 299.7058824
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3.345454545
Q1: 10
median: 17.2
Q3: 27.75
95-th percentile: 56.94
Maximum: 299.7058824
Range: 298.7058824
Interquartile range (IQR): 17.75

Descriptive statistics

Standard deviation: 19.51303316
Coefficient of variation (CV): 0.8807478316
Kurtosis: 27.69469772
Mean: 22.15507374
Median Absolute Deviation (MAD): 8.2
Skewness: 3.498252107
Sum: 65778.41393
Variance: 380.7584629
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
13 | 53 | 1.8%
14 | 40 | 1.3%
11 | 38 | 1.3%
20 | 33 | 1.1%
9 | 33 | 1.1%
1 | 32 | 1.1%
18 | 31 | 1.0%
10 | 30 | 1.0%
16 | 29 | 1.0%
17 | 28 | 0.9%
Other values (1000) | 2622 | 88.3%

Value | Count | Frequency (%)
1 | 32 | 1.1%
1.2 | 1 | < 0.1%
1.25 | 1 | < 0.1%
1.333333333 | 2 | 0.1%
1.5 | 8 | 0.3%
1.568181818 | 1 | < 0.1%
1.571428571 | 1 | < 0.1%
1.666666667 | 4 | 0.1%
1.833333333 | 1 | < 0.1%
2 | 24 | 0.8%

Value | Count | Frequency (%)
299.7058824 | 1 | < 0.1%
259 | 1 | < 0.1%
203.5 | 1 | < 0.1%
148 | 1 | < 0.1%
145 | 1 | < 0.1%
136.125 | 1 | < 0.1%
135.5 | 1 | < 0.1%
127 | 1 | < 0.1%
122 | 1 | < 0.1%
118 | 1 | < 0.1%

Interactions

[Interaction plots: 169 pairwise plots, one for each ordered pair of the 13 variables; images not preserved.]

Correlations

Auto

The auto setting is an easily interpretable pairwise column metric that picks a method for each pair of variable types (vartype-vartype: method): categorical-categorical: Cramér's V; numerical-categorical: Cramér's V (using a discretized numerical column); numerical-numerical: Spearman's ρ. This configuration uses the most suitable method for each pair of columns.

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
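
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs; the helper names (average_ranks, pearson_r, spearman_rho) and the toy data are assumptions made for the example. Tied values receive the mean of the ranks they span, which is a common convention.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)   # ranks agree perfectly
r = pearson_r(x, y)        # the linear fit is imperfect, so r is smaller
```

Because only the ordering of the values matters, ρ reaches its maximum on the cubic toy data while Pearson's r does not.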

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r; only the direction of the slope determines its sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
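
A small Python sketch of this formula, together with a check of the location and scale invariance mentioned above. The function name pearson_r and the toy numbers are assumptions for illustration only; they are not taken from the dataset.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors in the covariance and in both standard deviations cancel,
    so plain sums of deviations are used.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Toy data (hypothetical): a roughly linear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)
# Shifting and positively rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
# A negative scale factor flips the sign of r but keeps its magnitude.
r_flipped = pearson_r(x, [-b for b in y])
```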

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
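
The pair counting can be sketched directly in Python. This is an illustrative implementation of the definition as stated (often called τ-a); the function name kendall_tau_a and the toy data are assumptions. Statistical libraries commonly report a tie-adjusted variant (τ-b), so values may differ from this sketch when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither total
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Toy data (hypothetical): one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

In the toy data, five of the six pairs are concordant and one is discordant, so τ = (5 - 1) / 6.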

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

      df_index  customer_id  gross_revenue  recency_days  q_invoices     q_items  q_products  avg_ticket  avg_recency_days  frequency  q_returns  avg_basket_size  avg_unique_basket_size
   0         0        17850      5391.2100      372.0000     34.0000   1733.0000    297.0000     18.1522           35.5000     0.4861    40.0000          50.9706                  8.7353
   1         1        13047      3232.5900       56.0000      9.0000   1390.0000    171.0000     18.9040           27.2500     0.0488    35.0000         154.4444                 19.0000
   2         2        12583      6705.3800        2.0000     15.0000   5028.0000    232.0000     28.9025           23.1875     0.0457    50.0000         335.2000                 15.4667
   3         3        13748       948.2500       95.0000      5.0000    439.0000     28.0000     33.8661           92.6667     0.0179     0.0000          87.8000                  5.6000
   4         4        15100       876.0000      333.0000      3.0000     80.0000      3.0000    292.0000            8.6000     0.1364    22.0000          26.6667                  1.0000
   5         5        15291      4623.3000       25.0000     14.0000   2102.0000    102.0000     45.3265           23.2000     0.0544    29.0000         150.1429                  7.2857
   6         6        14688      5630.8700        7.0000     21.0000   3621.0000    327.0000     17.2198           18.3000     0.0736   399.0000         172.4286                 15.5714
   7         7        17809      5411.9100       16.0000     12.0000   2057.0000     61.0000     88.7198           35.7000     0.0391    41.0000         171.4167                  5.0833
   8         8        15311     60767.9000        0.0000     91.0000  38194.0000   2379.0000     25.5435            4.1444     0.3155   474.0000         419.7143                 26.1429
   9         9        16098      2005.6300       87.0000      7.0000    613.0000     67.0000     29.9348           47.6667     0.0244     0.0000          87.5714                  9.5714
99160982005.630087.00007.0000613.000067.000029.934847.66670.02440.000087.57149.5714

Last rows

      df_index  customer_id  gross_revenue  recency_days  q_invoices     q_items  q_products  avg_ticket  avg_recency_days  frequency  q_returns  avg_basket_size  avg_unique_basket_size
2959      5627        17727      1060.2500       15.0000      1.0000    645.0000     66.0000     16.0644            6.0000     0.2857     6.0000         645.0000                 66.0000
2960      5637        17232       421.5200        2.0000      2.0000    203.0000     36.0000     11.7089           12.0000     0.1538     0.0000         101.5000                 18.0000
2961      5638        17468       137.0000       10.0000      2.0000    116.0000      5.0000     27.4000            4.0000     0.4000     0.0000          58.0000                  2.5000
2962      5649        13596       697.0400        5.0000      2.0000    406.0000    166.0000      4.1990            7.0000     0.2500     0.0000         203.0000                 83.0000
2963      5655        14893      1237.8500        9.0000      2.0000    799.0000     73.0000     16.9568            2.0000     0.6667     0.0000         399.5000                 36.5000
2964      5659        12479       473.2000       11.0000      1.0000    382.0000     30.0000     15.7733            4.0000     0.3333    34.0000         382.0000                 30.0000
2965      5680        14126       706.1300        7.0000      3.0000    508.0000     15.0000     47.0753            3.0000     1.0000    50.0000         169.3333                  5.0000
2966      5686        13521      1092.3900        1.0000      3.0000    733.0000    435.0000      2.5112            4.5000     0.3000     0.0000         244.3333                145.0000
2967      5696        15060       301.8400        8.0000      4.0000    262.0000    120.0000      2.5153            1.0000     2.0000     0.0000          65.5000                 30.0000
2968      5715        12558       269.9600        7.0000      1.0000    196.0000     11.0000     24.5418            6.0000     0.2857   196.0000         196.0000                 11.0000